Performance evaluation of phonotactic and contextual onset-rhyme models for speech recognition of Thai language

Authors

Somchai Jitapunkul

Ekkarit Maneenoi

Visarut Ahkuputra

Sudaporn Luksaneeyanawin

Abstract

This paper proposes two acoustic models based on the onset-rhyme unit for speech recognition: the Phonotactic Onset-Rhyme Model (PORM) and the Contextual Onset-Rhyme Model (CORM). Each model comprises a pair of onset and rhyme units, which together make up a syllable. An onset comprises an initial consonant and its transition towards the following vowel. The rhyme, which follows the onset, consists of a steady vowel portion and a final consonant. Experiments were carried out to find the acoustic model that most accurately models Thai speech sounds and gives the highest recognition accuracy. The experimental results show that both the PORM and the CORM outperform the triphone model. The PORM achieves a syllable accuracy 2.74% higher than that of the CORM. Moreover, the onset-rhyme models are also more efficient than the triphone models in terms of system complexity.
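The onset-rhyme decomposition described in the abstract can be illustrated at the symbolic level with a minimal Python sketch. The `Syllable` type, the `onset_rhyme_units` function, and the phone labels below are hypothetical and are not taken from the paper; in particular, the sketch does not reproduce how PORM or CORM define their unit inventories, and it only splits a transcription into labels, whereas the acoustic onset segment also covers the transition into the vowel.

```python
from typing import List, NamedTuple, Optional, Tuple


class Syllable(NamedTuple):
    """Hypothetical symbolic syllable: initial consonant, vowel, optional final."""
    initial: str
    vowel: str
    final: Optional[str] = None


def onset_rhyme_units(syl: Syllable) -> Tuple[str, str]:
    """Split one syllable into an (onset, rhyme) pair of unit labels.

    Assumption: the onset label is just the initial consonant, and the
    rhyme label is the vowel followed by the final consonant, if any.
    """
    onset = syl.initial
    rhyme = syl.vowel + (syl.final or "")
    return onset, rhyme


def to_unit_sequence(syllables: List[Syllable]) -> List[str]:
    """Flatten a syllable sequence into alternating onset and rhyme labels."""
    units: List[str] = []
    for syl in syllables:
        units.extend(onset_rhyme_units(syl))
    return units


# Example with made-up labels: two syllables become four modelling units.
example = [Syllable("k", "a", "n"), Syllable("m", "aa")]
unit_sequence = to_unit_sequence(example)
```

Under these assumptions, every syllable maps to exactly two units, in contrast to a phone-based transcription where the number of units per syllable varies with the syllable's structure.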

Similar resources

Towards High Performance Phonotactic Feature for Spoken Language Recognition

With the demands of globalization, multilingual speech is increasingly common in conversational telephone speech, broadcast news and internet podcasts. Therefore, automatic spoken language recognition has become an important technology in multilingual speech related applications. For example, automatic spoken language recognition has been used as a preprocessing component for spoken language tr...

The multimodal nature of spoken word processing in the visual world: Testing the predictions of alternative models of multimodal integration

Ambiguity in natural language is ubiquitous (Piantadosi, Tily & Gibson, 2012), yet spoken communication is effective due to integration of information carried in the speech signal with information available in the surrounding multimodal landscape. However, current cognitive models of spoken word recognition and comprehension are underspecified with respect to when and how multimodal information...

Modeling code-Switching speech on under-resourced languages for language identification

This paper presents an integration of phonotactic information to perform language identification (LID) in a mixed-language speech. A single-pass front-end recognition system is employed to convert the spoken utterances into a statistical occurrence of phone sequences. To process such phone sequences, a hidden Markov model (HMM) is utilized to build robust acoustic models that can handle multipl...

The Dynamics of Spoken Word Recognition in Second Language Listeners Reveal Native-Like Lexical Processing

Models of spoken word recognition in monolingual, native listeners account for the dynamics of lexical activation of intended words and their phonologically similar “competitors,” in terms of continuous, cascaded processing dynamics. Here we explore how the dynamics of spoken word recognition differ for second language listeners. Groups of native Korean speakers (KL1) and native English speaker...

Homogenous ensemble phonotactic language recognition based on SVM supervector reconstruction

Currently, acoustic spoken language recognition (SLR) and phonotactic SLR systems are widely used language recognition systems. To achieve better performance, researchers combine multiple subsystems with the results often much better than a single SLR system. Phonotactic SLR subsystems may vary in the acoustic features vectors or include multiple language-specific phone recognizers and differen...

Publication date: 2003
